A robust stamp detection framework on degraded documents
Abstract
Detecting documents that carry a particular stamp instance is an effective and reliable way to retrieve documents associated with a specific source. However, this problem has remained essentially unaddressed. In this paper, we present a novel stamp detection framework based on parameter estimation of connected edge features. Using robust basic-shape detectors, the approach is effective for stamps with analytically shaped contours, even when only limited samples are available. For elliptic or circular stamps, it efficiently exploits the orientation information from pairs of edge points to determine the center position and area of the stamp, without computing all five parameters of an ellipse. Our approach takes into account the unique characteristics of stamp patterns. Specifically, we introduce effective algorithms to address the problem that stamps often spatially overlay their background contents. These give our approach significant advantages in detection accuracy and computational complexity over the traditional Hough transform method in locating candidate ellipse regions. Experimental results on real degraded documents demonstrate the robustness of this retrieval approach on a large document database consisting of both printed text and handwritten notes.
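The abstract does not spell out how edge-point pairs yield the center, so the following is only a minimal sketch of one classical geometric property that fits the description, not necessarily the authors' algorithm. For two points on an ellipse, the line joining the intersection of their tangents to the midpoint of their chord passes through the ellipse center. Intersecting such lines from several pairs gives the center without fitting all five ellipse parameters. The function names `center_line` and `estimate_center` are hypothetical, and the edge points and tangent directions are assumed to be already available (for example, from image gradients).

```python
import numpy as np

def center_line(p1, t1, p2, t2):
    """Line through the tangent intersection T and the chord midpoint M.

    p1, p2 are edge points; t1, t2 are tangent directions at those points.
    Returns (point_on_line, unit_direction), or None if the tangents are parallel.
    """
    p1, t1, p2, t2 = (np.asarray(v, dtype=float) for v in (p1, t1, p2, t2))
    # Solve p1 + a*t1 = p2 + b*t2 for the tangent intersection T.
    A = np.column_stack([t1, -t2])
    if abs(np.linalg.det(A)) < 1e-9:
        return None  # parallel tangents carry no usable intersection
    a, _ = np.linalg.solve(A, p2 - p1)
    T = p1 + a * t1
    M = 0.5 * (p1 + p2)
    d = M - T
    return T, d / np.linalg.norm(d)

def estimate_center(points, tangents):
    """Least-squares intersection of the center lines from all point pairs."""
    A = np.zeros((2, 2))
    b = np.zeros(2)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            line = center_line(points[i], tangents[i], points[j], tangents[j])
            if line is None:
                continue
            q, d = line
            # Projector onto the normal of the line; it measures the
            # perpendicular distance from a candidate center to the line.
            P = np.eye(2) - np.outer(d, d)
            A += P
            b += P @ q
    return np.linalg.solve(A, b)
```

On real degraded scans the tangent estimates are noisy, so a voting or robust-fitting step over the candidate lines would likely replace the plain least-squares solve used here.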
Similar resources
Robust Stamps Detection and Classification by Means of General Shape Analysis
The article presents current challenges in the stamp detection problem. It is a crucial topic these days, since more and more traditional paper documents are being scanned in order to be archived, sent over the network or simply printed. Moreover, an electronic version of a paper document stored on a hard drive can be taken as forensic evidence of a possible crime. The main purpose of the method presented ...
Stamp detection in scanned documents
The article presents current challenges in the stamp detection problem. It is a crucial topic these days, since more and more traditional paper documents are being scanned in order to be archived, sent over the network or simply printed. Moreover, an electronic version of a paper document stored on a hard drive can be taken as forensic evidence of a possible crime. The main purpose of the method presented ...
Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition
Document Image Analysis recognizes text and graphics in documents acquired as images. An approach without Optical Character Recognition (OCR) for degraded document image analysis has been adopted in this paper. The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document i...
Extraction of Original Text Document from a Set of Degraded Text Documents from the Same Source
Information extraction is the task of extracting structured data from a degraded document. It includes extracting data such as text, images or graphics from sources such as images, video or documents. Text detection and extraction from degraded documents finds application in a wide range of studies. In this paper, an Optical Character Recognition-less (OCR-less) method of obtaining an origi...
Document seal detection using GHT and character proximity graphs
This paper deals with the automatic detection of seals (stamps) in documents with cluttered backgrounds. Seal detection is a difficult challenge because of the seal's multi-oriented nature, arbitrary shape, overlap of its parts with signatures, noise, etc. Here, a seal object is characterized by scale and rotation invariant spatial feature descriptors computed from recognition result of individual conn...